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Abstract: This paper investigates the problem of locating a continuous chemical source 
using the concentration measurements provided by a wireless sensor network (WSN). Such 
a problem exists in various applications: eliminating explosives or drugs, detecting the 
leakage of noxious chemicals, etc. The limited power and bandwidth of WSNs have 
motivated collaborative in-network processing which is the focus of this paper. We 
propose a novel distributed least-squares estimation (DLSE) method to solve the chemical 
source localization (CSL) problem using a WSN. The DLSE method is realized by 
iteratively conducting convex combination of the locally estimated chemical source 
locations in a distributed manner. Performance assessments of our method are conducted 
using both simulations and real experiments. In the experiments, we propose a fitting 
method to identify both the release rate and the eddy diffusivity. The results show that the 
proposed DLSE method can overcome the negative interference of local minima and 
saddle points of the objective function, which would hinder the convergence of local search 
methods, especially in the case of locating a remote chemical source. 
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1. Introduction 

At present, advances in chemical sensing technology [1] have made it applicable to maintain a 
wireless network of inexpensive and reliable chemical sensors for environment monitoring. An 
important application of such a WSN is CSL, which consists of not only measuring the concentration 
of the objective chemical substance but also locating the chemical source using the concentration 
measurements. Designing an efficient CSL method may greatly reduce both human injuries and 
financial losses [2]. Compared with mobile sensors, a static WSN can cover a large area at a faster 
speed. This advantage is more remarkable when the target area constrains the mobility of the mobile 
sensors. For example, in the case of locating the source of a smoke cloud at the initial stage of a 
potential conflagration in a mountain, a WSN can be rapidly self-organized by a large number of 
sensor nodes which are deployed from a plane flying over the cloud, while the way of existing mobile 
sensors would be hindered due to the complicated obstacles. 

However, the working time of WSN would be limited if heavy communication burdens are 
imposed, since the sensor nodes are mainly powered up by batteries which cannot be easily charged on 
site. This problem makes the power consumption a key factor during the design of applications based 
on WSNs. To save the limited power, the most commonly used method is to reduce the power 
consumed by wireless communication. Specifically, this target can be realized by utilizing distributed 
methods which avoid transmitting raw measurements to the sink node. Moreover, the robustness of the 
distributed method is better than the centralized one, i.e., for distributed methods, any single node 
having problem does not influence the whole result. 

The problem of CSL using a network of chemical sensors has been intensively investigated [3-10] 
since the seminal work of Nehorai et al. [11]. Among these works, least-squares estimation (LSE) 
based methods require no statistical assumptions about the measurement errors, since the estimate is 
chosen to provide a "best" fit to the observed measurements in a deterministic sense [12,13]. 
Nevertheless, the objective function of LSE based CSL using WSN has multiple local optima and 
saddle points. Thus, the traditionally used search techniques may converge locally to a suboptimal 
solution. Matthes et al. [4] proposed a superior unimodal objective function which can largely reduce 
the difficulty of searching the single maximum located at the real source location. However, the raw 
concentration measurements should be gathered together in order to calculate the objective function. 
This process of gathering the raw measurements at the sink node conflicts with the scheme of 
distributed methods. 

This paper proposes a distributed, global convergent LSE method for solving the problem of CSL 
using WSNs. The problem of minimizing the objective function of LSE based CSL using WSNs, i.e., 
the sum of squared measurement errors, is decomposed into multiple sub-problems of minimizing 
individual summands, each of which is associated with the information of an individual sensor node 
and thus can be solved locally on the associated sensor node. Aiming to minimize an individual 
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summand, we derive a set of probable source locations by substituting the noisy measurement to the 
concentration distribution model. An estimate of the source location is locally determined from each of 
the location sets by empirically maximizing the probability of being the real source location based on 
the prior information about the source location. Then, the problem of CSL using a WSN can be solved 
by comprehensively incorporating the information of these local estimates into a global estimate. We 
consider a convex combination of these local estimates as the global estimate of the source location. 
The convex combination of these local estimates, which are distributed on individual sensor nodes, is 
realized in a distributed manner using the distributed average consensus algorithm proposed in [14]. 
Finally, the above-mentioned process repeats iteratively until the termination condition is satisfied and 
the global estimate is considered as the prior information about the source location for determining the 
local estimates at the next iteration. The proposed method was assessed in both simulations and real 
experiments, including the case of locating a remote chemical source. 

The rest of this paper is organized as follows: the chemical concentration distribution model, the 
sensor-source distance function, and the measurement model are described in Section 2. The DLSE 
method for CSL using WSN is proposed in Section 3. The results of numerical simulations and real 
experiments are presented in Sections 4 and 5, respectively. Some concluding remarks are given in 
Section 6. 

2. Formulation and Preliminaries 

In this section, the advection-diffusion chemical concentration distribution model is introduced, and 
then the distance between the unknown source and the sensor node is represented as a function of their 
relative angle and the theoretical concentration at the sensor node location. Finally, the concentration 
measurement model of the sensor nodes in the WSN is presented. 

2.1. Chemical Concentration Distribution Model 

The advection-diffusion model proposed in [15] is adopted in this paper. In the theory of eddy 
diffusion in the atmosphere, the diffusion rate at a certain point under each pressure level is 
parameterized by the eddy diffusivity, K cm /s. The concentration of the chemical substance, c g/mL, 
can reach a steady state in a homogeneous wind field in which the eddy diffusivity is the same at every 
point [15]. As shown in Figure 1, the continuous chemical source located at point O is releasing at a 
rate, q g/s. The released chemical substance is advected by a homogeneous wind [16], the speed of 
which is V cm/s along the positive x-axis. The sensor, which is represented by point P at {x,y,z), lies in 
the plume caused by the chemical source. The steady concentration at the point {x,y,z) satisfies: 

The concentration distribution is symmetry about the j:-axis due to the homogenous flow, and then 
Equation (1) becomes: 



vcos p 



d V sin /? 5 



dd d dp 



K 

d^ 



d ( ,ydc\ 1 d ^ 



sin B — 

ddy dd) sin pep I dfi 



d'— | + - 



(2) 



where is the angle between OP and OX [15]. 
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Figure 1. Coordinate systems used in this paper, (a) 3-dimensional Cartesian coordinate 
system; the distance between O and P is d cm, the angle between OP and OX is y9. 
(b) 2-dimensional Cartesian coordinate system; the case that P is located on the XOY 
plane, and a is the angle from the positive jc-axis to PO. (c) the polar coordinate system in 
which P is taken as the origin, so that the position of O can be determined based on the 
position of P. 






*■ X 



(c) 



Considering that the concentrations at the source location and the point extremely far away from the 
source (i.e.,d^+x: ) approximately equal +00 and zero, respectively. Then, the solution of 
Equation (2) is: 



cix,y,z) = 



4n:Kd 



exp 



V(i(l-cos/?) 
2K 



(3) 



In the case studied here, the chemical source and the sensor nodes are situated on an impenetrable 
floor. To analyze the concentration distribution on the floor, we first suppose that the chemical source 
is placed at (0,0,h). Since no chemical matters can diffuse through the impenetrable floor, the 
concentration at P can be calculated using the method of mirror images [17] as follows: 



c(x,y,z) = 



47TKd 



7 exp 



vJ'(l-cos/?) 
2K 



47rKd 



V6?"(l-COS/?) 

2K 



(4) 



where d' = -^x^ + +{z-hf and d" = ^x^ + + {z+hf . Then, substituting h = 0 and z = 0 into 
Equation (4) for the concentration in our case, the concentration at P is given by: 



c(x,y) = 



iTtKd 



exp 



V(i(l-cos/?) 
2K 



(5) 



More generally, when the chemical source is located at another point (xo,jo) in the 2-dimensional 
coordinate system, the associated distance will be J = ^J(x-x^y + (y- yo)^ ■ 

To emphasize the importance of the node locations in determining the source location, the location 
of the chemical source is transferred into the polar coordinate system in which the sensor location P is 
taken as the origin, as shown in Figure Ic. Then, the coordinate (xo,yo) is expressed as: 

[xq - x + dcosa (6) 
[yo ~ y + dsina 
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where a e (0, Iti} is the angle from the positive x-axis to the broken line pointing from the measuring 
location to the source. The relationship between a and P is shown in Figure lb. Obviously, a=P+7r 
and Equation (5) becomes: 



q I vd[l-cos,{a- n)] 
c{x,y)= ^ ,^, exp<^ 



InKd 
InKd 



exp 



IK 

vfi?(l + cosa) 



(7) 



The advection-diffusion model introduced above was also referred to as Robert's model in [18]. 
Some of other studies based on this model can be found in [4,17,19,20]. 

2.2. Deriving the Sensor-Source Distance Function 



Based on the concentration distribution model in Equation (7), given the concentration c, flow 
speed V and the parameters K and q, the distance between the chemical source and the sensor node can 
be expressed as a function of a by reverse derivation. 

First, applying the equation cos a = 2cos^ — - 2sin^ — to Equation (7), we get 



c - 




d - exp 



vJcos 
K 



(8) 



Then, moving the denominator in the right-hand-side of Equation (8) to the left-hand side and 

vcos^ I ^ I 

multiplying both sides with , we obtain 



Kc 

vd cos^ 
K 



-exp 



vd cos 



K 



a 



vqcos 

z 

IttK^c 



(9) 



Based on the principle branch of the Lambert W function, i.e., Wq{s),s>Q , which satisfies 
Wo(>s')exp[Wo(5')] = s,s>0 [21], when a^TU , Equation (9) can be rewritten as: 



vJcos 



2 _ 



K 



a 



vqcof, 
z 



(10) 



Finally, the sensor-source distance d can be represented as follows: 



d = 



IttKc 

f 



K 



2 a 

vcos 

2 



a 



vq cos 
z 

InK^c 



others 



(11) 
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Since the principle branch of the Lambert W function is single-valued and monotonically 
increasing, according to Equation (11), d is inversely proportional to c when other variables are 
kept constant. 

2.3. Measurement Model 

Suppose there is a WSN that consists of n chemical sensor nodes randomly deployed at known 
global coordinates X; =(x,, on the same 2-dimensional field as the chemical source 

deployed at Xg =(x,),yo). Note that the boldface lowercase letter denotes the coordinate vector of a 

sensor node, as well as the sensor node itself, which is easy to be distinguished in the rest of this paper. 
After the steady state of concentration distribution is reached, the measurements {z,,/ at the 

same sampling time can be formulated as: 

2,=c,+e,- (12) 

where c. represents the theoretical concentration at the location of jc, , stands for the measurement 

error which is mainly comprised of the errors caused by the uncontrollable drift of the sensors and the 
sensors' response to foreign substances due to the poor selectivity of the commonly used chemical 
sensors [2]. Typically, it can be considered that the error terms for all the sensor nodes at the same 
sampling time satisfies the normal distribution with a positive mean value [11]. Due to the relatively 
small variance of the normal distribution of e.,i e [l,/i] , the value of z,,/ e [1,«] can roughly represent 

its signal-to-noise ratio (SNR), i.e., \{zJe.)-\\ [6,7]. For a better localization performance, a 

measurement threshold is set to eliminate the measurements with comparatively low SNRs. The nodes 
with measurements that are smaller than the threshold z,^ will be scheduled to hibernate, and the 

remaining l{l<n) nodes will participate in the localization. 

3. Distributed Least-Squares Estimation Method for CSL Using WSN 

In the LSE-based CSL using a WSN, the objective function to be minimized is the sum of squares 
of the / participating scattered sensor nodes' measurement errors [22], which is as follows: 

objix^) = YyZi-c.{x^)f, G 9?' (13) 
(=1 

where z,,/ e [1,^] and c^{xQ),i e [1,/] are known measurements and unknown theoretical concentration 
(since x^ is unknown during the localization), respectively. By varying the location of jCq in its 
domain, i.e., jc,, e 91^ , the distribution of this objective function can be obtained. The logarithm of this 

distribution is illustrated in Figure 2. There is a global minimum located at the actual source location, 
while there are some local minima near the global minimum, as well as some saddle points around the 
local maxima. 

As shown in Figure 2, there is a global minimum located at the actual source location. It is 
straightforward that the objective function in Equation (13) would achieve its global minimum, when 
the / summands, i.e., iz^-c^ix^y^ , reach their minima simultaneously. The fact that [z, -c,(JCo)]^ is 
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irrelevant to all the sensor nodes except x, , motivates the idea of designing a local estimation of the 
source location, which is aiming to minimize [z, - c,.(jc„)]^ , for jc, . 

Figure 2. Mesh graph of the objective function in Equation (13); the actual source is 
located at (3,40) m. The unit of the objective function is ppm . Some of the local minima 
and the saddle points are pointed out. More details about the simulation setup are given in 
Section 4. 




After the estimates of these local estimations are achieved, they are convexly combined to form the 
global estimate by using an average consensus algorithm. The overall scheme of the proposed method 
is given in Figure 3, in which the sub-processes are detailed in the following subsections. 

Figure 3. The flow chart of the proposed localization method. The location estimations are 
conducted locally on all of the sensor nodes which have measured above-threshold 
concentration values. x^^{k) is the global estimate at the k-\h iteration of localization. 
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3.1. Local Estimation of the Source Location 

In Section 2, the location of the chemical source has been transformed into the polar coordinate 
system, in which the location of sensor node is taken as the origin. According to Equation (11), when 
a^TT the coordinates of the source location can be further expressed as functions of a as follows: 



^:(l-tan' ^) 



a 



vqcos, 
z 



J 



2^ «T,. 
V 2 



a 



v^cos 
z 

InK^c 



(14) 



For the case that a ;r , the source location is (Xq, y^) = (x 



2nKc 



, y) . However, the calculation of 



Equation (14) involves determining the theoretical concentration c, which is impossible to be obtained 
due to the unknown source location. Since e.,i ^[1,1] are minor compared with C;,/e[l,/], the 

measurement Z;,je[l,/] can be considered as an approximate estimate of c,,/e[l,/] . Then, by 

substituting z, for c in Equation (14) and changing a in its domain (0,2;r], we can determine a set of 

points which are approximate estimates of the source location. The point set corresponding to 
Z;,«e[l,/] is denoted by 5; = {jc,',},/ e [1,/] . In the context of probabilistic inference, the points in 

5; = {jc^},? e [1,/] can be considered as the probable source locations inferred based on 

Compared with the domain of Xq , i.e., jCq e 9?^ , the set S. = {jCq},/ e [1,/] is substantially refined, and 

thus is referred to as the refined sample space (i.e., the set of all possible results) for the subsequent 
local estimation on x. , / e [1, /] . 

Figure 4. The oval formed by all the points in a single refined sample space, i.e., 
S.,iG[l,l]. The cross and the triangle denotes the location of x.,iG[l,l] and the local 

estimate, i.e., x^,/ e [1,/] , respectively. The arrow is the polar axis, the direction of which 

is also the flow direction. 




After Zi is substituted for c. Equation (14) becomes an equation set with a single unknown a whose 
domain is (0,2;r]. Therefore, as shown in Figure 4, the problem of estimating Xq can be transformed 
to another problem of estimating the angle of the vector pointing from x. to Xq , i.e., a'^ . Although it is 
difficult to assign a rigorous theoretical probability of being the actual source location to each point in 
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S.,i e [1,/] , if a predictive location of the source exists, it can be empirically considered that the point 

with the same polar angle as the predictive location has the largest probability of being the actual 
source location among all the points in S.,i e [1,/] . Inspired by the process of iterative estimation, the 

global estimate x^ik — V) is considered as the prior information (predictive location of the source) for 

the local estimations at the k-\h iteration of localization. Let us define d.{k) as the angle between the 

vector jCo(^-l)-x, and the positive direction of the polar axis. Then, a local estimate of the source 

location can be considered as: 

x[,{k) = x[,{Zi{k),d.{k)),i&{\,l] 

d.(k)^angle{xQ(k-l)-x.) ^^^^ 

where x'Q(Zi(k),d.(k)) denotes an estimate of the source location determined by df(k) from the 
refined sample space S. = {jCq},/ e [1,/] calculated using Zj(k) . This process of local estimation can be 
considered as a quasi-maximum a posterior estimation because the maximum posterior probability is 
determined empirically, rather than theoretically, based on the location of Xo(^-l). Note that, at the 
beginning of the localization there is no prior information about the source location. To start the 
localization, an initial point must be set as the prior information for the first iteration of localization. 
Since the initial point has Uttle influence on the localization performance (see Section 4.3), it can be 
randomly initialized. 

3.2. Combining the Local Estimates to Form the Global Estimate 

Section 3.1 presents a framework for the local estimation of the chemical source location based on 
the prior information about the predictive location of the source. However, it does not detail how this 
prior information is achieved and how it is accessible to the participating scattered sensor nodes. 
Moreover, the individual local estimates can only minimize an individual summand of the objective 
function in Equation (13) with the maximum probability. Thus, to comprehensively consider all the 
summands of the objective function in Equation (13), these local estimates are combined to form 
a global estimate, which serves as the prior information about the source location at the next iteration 
of localization. 

3.2.1. Convex Combination of the Local Estimates 

A convex combination of the local estimates at the ^^-th iteration of localization, i.e., XQ(k) , is 
considered as the A:-th global estimate JCo(^) • 

Xo(k) = ^w^(k)x'o(k) (16) 

I 

where w^(k) satisfies ^w.(k) = l and w,(fc)>0 (rendering the combination in Equation (16) as 

(=1 

convex), x'^ik) is calculated using Equation (15). 

To set reasonable weights for the local estimates in Equation (16), let us consider the positional 
relationship between x^ and the points in S.,iG[l,l] from which the local estimates are selected. 
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When the different contaminated measurements z^A e [1,/] are substituted for c in Equation (14), the 
size of the associated ovals formed by the points in different refined sample spaces {i.e., S-,i e [1,/]) 
are different. The oval generated by substituting a larger for c in Equation (14) is smaller than that 
generated using a smaller . This is mainly because d will decrease if c increases and other parameters 

are kept constant according to Equation (11). Moreover, as shown in Figure 5, the points on a smaller 

oval are generally closer to ^" than those on a bigger oval. Therefore, it is reasonable to construct a 
direct proportion between w.(k) and z^ik) to make the sequence of x^ik) approach jc„. The weights 

w.(k) are tentatively set as the proportion of z,(fc) in the sum of all the / measurements, as in 
Equation (17), and achieve good performance in the results presented in Sections 4 and 5: 



W,(k)^^^,iG[l,l] 

Figure 5. The positional relationship between the actual chemical source location and the 
points in different refined sample spaces, i.e., 5;,/e[l,/], which are represented by 

different ovals. The sensor nodes are denoted as crosses. The simulation setup used to 
generate this figure is the same as that used in generating Figure 2, which will be detailed 
in Section 4. 



(17) 




3.2.2. Distributed Average Consensus Algorithm 

It is readily seen that there are two individual accumulating operators in Equation (16) and 
Equation (17). Moreover, the accumulating results, i.e., x^ik) and the denominator in Equation (17), 

should be accessible to all the participating scattered sensor nodes. The centralized processing method 
in which a fusion center gathers the summands from and then transmits the sum back to all the sensor 
nodes is not power-efficient for WSN, and thus is avoided in this paper. Since the sum can be 
considered as the product of the amount and the average of the summands, the process of accumulation 
can be transformed to an averaging process and an additional multiplication between the amount and 
the associated average as follows: 
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X/;(0) = /xAvg{/;(0),/e[l,/]} 



(18) 



where Avg{-} means the averaging operator, and y|o,?e[l,/] are the initial states of the values to be 
averaged. For example, in the case of calculating Equation (16), /(O) = Wj{k)x[^{k),i e [1,/] . Then, the 

distributed average consensus algorithm in [14] can be used to calculate the averaging term in 
Equation (18), which is involved in the convex combination of local estimates twice, i.e., in calculating 
Equations (16) and (17). 

The distributed average consensus algorithm means that only local communications within the 
neighboring set of each participating scattered sensor node are needed during the averaging 
procedure [23]. The steps of adopting this algorithm to our specific problem are elaborated as follows: 

Step 1: Generate the corresponding graph and the Laplacian matrix. Suppose each participating 
scattered sensor node can only communicate with the sensor nodes within its neighboring range. A 
graph describing the corresponding network architecture is generated. In the generated graph, there is 
an edge between two nodes which can communicate with each other. Then, the Laplacian matrix 
(denoted by L) of the graph is calculated as L = AA , where A is the incidence matrix of the graph. 
Since A is a matrix with / rows and g columns, where g is the number of edges in the graph, L is a 
/-dimensional square matrix. 

Figure 6. The associated graph of the mesh architecture in Figure Id. The sensor nodes are 
denoted as circles, in which the numbers are the indexes of the associated nodes. 
Communication links within the neighboring ranges are represented by solid lines, and the 
indexes of these edges are given aside them. 




For example, as shown in Figure 6, there are / = 18 vertexes and g = 32 edges in the graph 
corresponding to the mesh architecture in Figure Id, so the corresponding Laplacian matrix is a 
18-dimensional square matrix. 

Step 2: Calculating the modulus matrix. A modulus matrix, which is denoted by M, is calculated 
as follows: 



M =1 



2 



■L 



A,(L) + A,_,(L) 



(19) 
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where /l.{L),i e [1,/] is the /-th largest eigenvalue of L, and / is the identity matrix. Then, the nonzero 
elements in the j-th row of M, which is denoted by M. , are transmitted to x. . These nonzero elements 
corresponds to x. and its neighboring nodes. For example, in Figure 5, there are three nonzero 
elements in M, , i.e., M^^,M^^ and Mj^ , which correspond to XpXj and x^ , respectively. Since only a 

few nodes are connected with each participating scattered sensor node, the matrix M is sparse and the 
number of the elements to be transmitted is small. 

Step 3: Performing the iteration of averaging. Since the proposed localization algorithm and the 
average consensus algorithm used here are both iterative algorithms, it is important to distinguish them 
from each other. The iteration in the former is called the iteration of localization (counted by k), while 
the iteration in the latter is called the iteration of averaging (counted by t). In the A:-th iteration of 
localization, two whole processes of iterative averaging {i.e., from the initial state to the convergent 

State) should be conducted to calculate ^Zy(fc) in Equation (16) and ^w,.(^)fo(^) in Equation (17). 

7=1 i=l 

In the t-\h iteration of averaging, x.,i&\X,l\ first transmits its own value f.{t),ie:\\,l'\,t>\ (an 
updating version of /j.(0),/ g [1,/] ) to its neighboring nodes, and then updates it by adding up the 

products of the nonzero elements and the values received from its neighbors. For example, according 
to the topology illustrated in Figure 6, f^{t)=M^J^{t-\) + M^^f^{t-\) + M^^f^{t-\) at the Mh 

iteration of averaging. This step actually realizes a matrix multiplication as follows: 

f(t + r) = Mxfit) (20) 

where f(t) is a column vector that consists of f.(t),i ^[1,1]. Note that Equation (20) coincidentally 

involves / local convex combinations on the / participating sensor nodes, which can be distinguished 
from the global convex combination in Equation (16) by analyzing their different summands. The fast 
convergence of Equation (20), which means f(t),i&ll,l] would converge to (X'-i-^;^^))/^ ^ 

iterations of consensus, has been proved in [14]. 

If the variation of the global local estimation, i.e., Ax^^(k) = x^^(k)-XQ(k-l) , has not exceeded a 

predefined range threshold Ax'^ for three iterations of localization, the process of localization will be 

terminated. The sink node should know the indexes and locations of x.,ie:[l,l] for generating the 

corresponding graph, and x.,i should know / and the nonzero elements in M.,/g[1,Z] for 

performing the iterations of consensus. Thus, some information exchanges between the sink and the 
participating scattered sensor nodes are engaged in the first two steps. However, the first two steps are 
conducted only once at the first iteration of localization, i.e., k = I. Therefore, the communication 
burdens in these two steps are acceptable. Basically, there should be some routine communications 
between the sink and the scattered nodes to guarantee the conventional operations of the WSN. 

4. Simulation Results 

In this section, the localization performance of the proposed DLSE method is assessed through 
simulations. First, the basic simulation setup, which is used to generate Figures 2 and 5 and served as 
the prototype of the following simulations, is described. Then, the process of distributed averaging, 
which is the premise of distributed implementation of DLSE, is demonstrated. Afterwards, the 
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localization performance of DLSE is compared with the trust-region-reflective algorithm which is 
recommended for solving centralized nonlinear LSE problem. Finally, the ability of locating a remote 
chemical source using DLSE is assessed. 

4.1. Basic Simulation Setup 

The simulated WSN was composed of 300 nodes that were randomly distributed over an 80 m x 80 m 
square. The lower left corner and bottom margin of this square were taken as the origin and abscissa 
axis of the global Cartesian coordinate system, respectively. In Figures 2 and 5, the chemical source 
was located at (300,4000) cm, and the chemical substances released from the source was diffused by 

4 2 

the flow with a near-surface velocity of v = 70 cm/s. The eddy diffusivity, K, was set as 10 cm 7s. The 
unit of the release rate in Robert's original paper [15] is g/s, and thus the associated unit of 
concentration is g/cm . At 20 °C and standard atmospheric pressure, the mass concentration in g/cm 
can be transferred to the volume concentration in the most commonly used unit, ppm, by multiplying 
an exponential (22.4/M) x lO' , where M is the molecular weight of the released substance. The release 

rate q in our simulations was set as 1 14 g/s, which can cause the concentration range up to dozens of 
ppm when M = 46 (i.e., the molecular weight of ethanol) and the sensor-source distance is tens of 
meters. Referring to [7], the mean and standard deviation of the measurement errors were 
10 kg/m and 8 x 10 kg/m . Correspondingly, the concentration threshold was set as 10 kg/m to 
eliminate the measurements with extremely low SNRs. 

By setting a presumed chemical source at the center of each of the mesh grids on the 
node-deployment square, which is denoted as x,, , a group of c',/e[l,/] can be calculated by 

substituting x,', for jc,, in Equations (6) and (7). Then, the value of the objective function with respect 
to different presumed source locations can be calculated by substituting different groups of z^J e [1,/] 
and the corresponding c',/e[l,/] to Equation (13). The logarithmic transform of these values are 

illustrated in Figure 2. According to Figure 2, a global minimum appears near the actual source 
location, while there are also several local minima and saddle points at other locations in the square. 
To generate Figure 5, the measurements of 30 participating scattered sensor nodes, i.e., z,,/ e [1,30] 

were substituted for c in Equation (14). 

4.2. Demonstrating the Process of Distributed Averaging 

As mentioned in Section 3, the convex combination of local estimates is calculated by using a 
distributed average consensus algorithm, which serves as the prerequisite of the distributed 
implementation of our method. Thus, before assessing the locaUzation performance of our method, it is 
necessary to demonstrate the process of distributed averaging. 

At the k-th iteration of localization, the coordinates of the local estimates JC;(^),/ e [1,/] were 

convexly combined to form the global estimate, i.e., x^ik) , which were considered as the initial 
values, i.e., = JC;(^),/ e[l, /] . Then, these values were updated by using the values from the 

neighboring nodes at each iteration of averaging. The abscissas of the local estimations, i.e., 
x-{k),i e [1,/], at different iterations of localization were illustrated in Figure 7. As shown in Figure 7, 

the distributed averaging of x.{k),i & \X,l} was performed eight times, which were associated with 
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eight iterations of localization. At each time of the averaging, the abscissas maintained by different 
participating scattered sensor nodes converged to the abscissa of the global estimate, i.e., 
Xf^{k),k e[l,8] , in about ten iterations of consensus. Moreover, the abscissas of these global estimates 

converged near the abscissa of the actual chemical source which was located at (300,4000) cm. 

Figure 7. The variation of the abscissas of local estimates in eight successive iterations of 
localization. To get a clear illustration, the sector surrounded by the black rectangle in 
(a) is magnified and displayed in (b). 




Distributed averaging iterations Distributed averaging iterations 

(a) (b) 



4.3. The Influence of the Initial Point 

The trust-region-reflective (TRR) algorithm [24], which is recommended in Matlab, was used to 
solve the standard centralized nonlinear LSE of the source location and as the benchmark to assess the 
performance of our method. The TRR algorithm, which is based on the interior-reflective Newton 
method, is a subspace trust-region method. Trust region approaches approximate the function to be 
minimized with a simpler function, which can reasonably reflect the behavior of the original function 
in a neighborhood around the current point, i.e., the trust region. Then, the trust-region sub-problem, 
which minimizes the simpler substituted function over the trust region, is solved at each iteration of 
trust region approaches. The trust-region methods are local search methods since the solution can only 
move in the trust regions. Thus, TRR is a typical local search algorithm, although the structure of the 
nonlinear LSE problem is exploited in TRR to enhance efficiency. 

The influence of initial point on the localization performance of TRR and DLSE were assessed in 
two different cases, and the results are illustrated in Figure 8. Note that in Figure 8, the initial points 
are directly connected to the associated final estimates for TRR, while the intermittent global estimates 
in a single localization were connected to each other for DLSE. As shown in Figure 8, TRR succeeded 
several times when the initial point was near the source location, otherwise, it would be stagnated at 
some of the local optima or failed in starting the search. Contrarily, DLSE succeeded from all of the 
initial points in the two cases, and achieved relatively small localization errors compared to the large 
size of the node deployment area. Therefore, the locations of initial points hardly influence the 
localization performance of DLSE. In other words, DLSE can overcome the problem of poor 
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convergence caused by the multiple local minima and saddle points of the objective function in 
Equation (13), which would obstruct the convergence of TRR and other local search algorithms. 



Figure 8. The influence of initial point on the localization performance of TRR and DLSE. 
The initial points, sensor nodes and the final global estimates are denoted as circles, crosses 
and triangles, respectively. The source is located at (300,4000) cm in (a) and (c), while it is 
located at (-4000,4000) cm in (b) and (d). (a) and (b) show the localization results of TRR, 
while (c) and (d) illustrate the variation of global estimates in the process of performing 
DLSE. (e) and (f) are the magnified versions of a small rectangular area around the source 
in (c) and (d), respectively. 
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4.4. The Performance of Locating a Remote Chemical Source 

As shown in Figure 8d, DLSE succeeded in locating a remote chemical source which was located 
far away from the node deployment area. However, the results presented in Figure 8d was obtained 
based on limited times of simulation of locating the same source. To get a more convictive assessment 
of the performance of locating a remote chemical source, the actual chemical source was located at 
different locations far away from the node deployment area. For each location of the chemical source, 
100 times of simulations were conducted, with the measurements collected by the nodes which were 
randomly re-deployed at the beginning of each simulation. TRR was not considered in these cases due 
to its poor convergence even when the actual chemical source was located in the node deployment area 
as shown in Figure 8a. Since the initial point can hardly influence the performance of DLSE, it was set 
as the centroid of the node deployment area, i.e., (4000,4000) cm, in all of these simulations. The 
statistical results of these simulations were represented with error bars in Figure 9. 

As shown in Figure 9, all of the chemical sources were successfully located with relatively small 
localization errors compared with the large sensor-source distances. The localization errors became 
larger with an increase in the distance between the source and the node deployment area. Because the 
theoretical concentration is inversely proportional to the sensor-source distance when other variables 
are kept constant according to Equation (11), the SNR of the measurements decreased, which would 
influence the localization performance, as the distance between the source and the node deployment 
area increased. However, the performance of any localization method based on concentration 
measurements might be influenced by the SNR of the measurements. 

Figure 9. The statistical performance of locating a remote chemical source using DLSE. 
All the ordinates of these chemical sources were set as 4000 cm, thus, only the abscissas of 
the sources were displayed to denote the associated sources. The center and half-length of 
the error bar are the mean and standard deviation of the localization errors, respectively. 
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5. Real Experiment Results 

In addition to the simulations, our methods were assessed in realistic experiments using a WSN 
which consists of twenty five real sensor nodes. 
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5.1. The Sensor Nodes and Realistic Environment 

The sensor node was constructed based on the C51RF-CC2431 module (Wireless Dragon, Co. Ltd., 
Chengdu, China). Each of the nodes was equipped with a MiCS-5521 (SGX sensor Technology, Co. 
Ltd. Neuchatel, Switzerland) gas sensor to measure the chemical concentration. The nodes can 
communicate with the sink node via the wireless ZigBee protocol. The sensor nodes are able to 
reliably transmit the real-time sensing voltages of MiCS-5521 to the sink node. Due to the limitation of 
our currently available nodes, the wind speed was measured by two additional WindSonic 
anemometers (Gill, Co. Ltd., Hampshire, England). 

Figure 10. The general picture of the experimental installations. The bubbler was kept in a 
thermostatic bath so as to keep pressure inside the bubbler at a constant saturated vapor 
pressure. The pump and the flow controller were used to blow the saturated ethanol vapor 
out from the bubbler into the wind tunnel through a long PVC pipe. 




The general picture of the experimental installations is shown in Figure 10. A reduced scale wind 
tunnel, which is a cuboid with 300 x 400 cm area by 90 cm height, was built up to create an 
approximately homogeneous wind field. The inlet and outlet of the wind tunnel are two parallel sides 
of the cuboid. An array of electrical fans was integrated in a flat rack, which was mounted in the wind 
inlet. Clean air was blown into the tunnel through the wind inlet. More details about the wind tunnel 
can be found in [25]. A pump and a flow controller were used to blow saturated ethanol vapor from the 
bubbler into the wind tunnel through a long PVC tube. The density of the saturated ethanol vapor in 
the bubbler was maintained by a thermostatic bath. The saturated vapor pressure and density can be 
calculated using Antoine's equation and the Clausius-Clapeyron's equation, respectively. Then, the 
product of the volume flow rate and the density of the saturated ethanol vapor can be considered as the 
theoretical release rate of the chemical source. 

5.2. Sensor Calibration 

Before the sensor readings can be used to represent the concentration, the sensors should be 
calibrated to find an accurate mapping from the sensor readings to the concentration. Due to the 
controlled homogenous wind field, the concentration distribution and the sensor readings are 
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approximately steady after a period of time [4]. Since this specific work condition approaches the 
closed sampling space [26], during the calibration process the sensor nodes were enclosed inside a 
closed glass box whose cubage is 108 L, as shown in Figure 11. The box was placed in the wind tunnel 
to make the sensors work under the same humidity and temperature like those in the process of 
localization. Then, multiple calibration points can be created by dropping in certain volumes of 
absolute ethyl ethanol and blowing them to accelerate their evaporation with convective wind. The 
relationship between the sensitivity 5, i.e., the ratio of the sensing resistance to the baseline resistance, 
and the concentration Cs (in ppm) is well fitted to the transform relationship log 5 = a-logC, + b 

the range e [10,1000] ppm [27]. Thus, the unknown parameters, i.e., a and b, can be identified by 

fitting the steady sensing voltages and the associated actual concentrations to the calibration curve. 
Note that we refer to the final steady concentration, which was calculated by substituting the 
approximately steady voltage to the identified transform relationship, as the measurement in the rest of 
this section. 

Figure 11. The experiment setup for calibrating the chemical sensors on the nodes. 
Different concentrations were created by dropping associated volumes of absolute ethanol 
into the glass box through the ethanol inlet. Both the ethanol inlet and the node inlet were 
sealed up during the calibration. 




5.3. Identification of the Release Rate and the Eddy Diffusivity 

According to Equation (8), when a = -n , which means the sensor node lies on the positive 
abscissa axis of the Cartesian coordinate system in Figure lb, the concentration of the node is: 

_ q _ \ J_ 

where x — qI^ be considered as a constant parameter during the process of localization. 

However, the premise is that the value of q and K are identified before the localization. Suppose a 
standard chemical source, which was releasing at a controlled release rate = 0.01 g/s, was placed in 

the wind tunnel and taken as the origin of a Cartesian coordinate system like the one in Figure lb. All 
the sensor nodes were placed along the positive abscissa axis. The abscissas of these nodes were set on 
the premise that the reciprocals of the distances from these nodes to the standard source should range 
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from 0.003 to 0.008 in increments of 0.0005. The relationship of the concentration measurements of 
these nodes and the reciprocals of the associated distances was fitted into a straight line, which was 
shown in Figure 12a. Then, the value of ^/2^'2'o - identified as the gradient of the fitted 

line, and K can be calculated by dividing g„ by the identified value of . According to Figure 12a, 
the function of the fitted line is c = 6002A/d +0.62, and the identified value of K is 103.15 cm^/s. 

Note that in real cases, the source to be located and the standard source may release simultaneously, so 
the concentration measurements used to identify K should be the increments of the measurements 
caused by the standard source. 



Figure 12. The experiment results of identifying q and K. (a) the measurements collected 
along the centerline of the plume were fitted to a line, of which the gradient can be 

considered as ^^^^o^/^ (\^^ (j^g measurements along the line vertical to the plume 

centerline. The mean and standard deviation of these measurements were considered as the 
parameters for generating the normal distribution. 
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With the identified value of K, we can further identify q by measuring the concentration along the 
centerline of the plume. The first step is to determine the unknown centerline of the plume caused by 
the source to be located. Because the distribution of the concentration measurements collected along 
the line perpendicular to the wind direction satisfies the normal law of errors [15], the unknown 
centerline can be determined as the line passing the maximum concentration point along the downwind 
direction. To verify this normal distribution, the twenty five sensor nodes were placed along a line 
vertical to the plume centerline. The measurements of these sensor nodes were plotted in Figure 12b. 
Since the measurements on the centerline of the plume caused by the source to be located satisfy 
d = ^/l s , r = ^/v , the value of can be determined as the gradient of the line fitting the 

relationship between d and j/ . However, the distance between the sensor node and the unknown 
source, i.e., d, was unknown. The gradient of the fitted line can only be approximately determined as 



when AJ is relatively small. Thus, fifteen sensor nodes were placed with equal spacing 



A<i = 20 cm on the centerline of the plume caused by the source to be located. The noisy concentration 
measurements of these sensor nodes, i.e., z,,/e[l,15] , and the associated value of 
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A(l/z) = 1/ Zf -yZi_^ ,i e [2,15] were shown in Table 1. In this experiment, the flow rate of the bubbler 

and the temperature of the bath were kept at 600 seem (mL/min) and 60 °C, respectively. As 

mentioned in Section 5.1, the calculated theoretical release rate was 0.0078 g/s. As shown in Table 1, 
the mean of the values of A(l/z) = 1/ Zj - 1/ z, i , i e [2, 15] is 0.0037 ppm~\ Then, the value of q can be 

identified as 0.0071 g/s, which is close to the calculated theoretical release rate. 

Table 1. The concentration measurements at the locations with equal spacing Ad = 20 cm 
on the centerline of the plume caused by the source to be located. 



ID 


1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


11 


12 


13 


14 


15 




54.89 


49.58 


41.03 


3348 


35.16 


31.15 


27.68 


21.52 


26.53 


22.67 


19.39 


19.03 


17.11 


11.91 


15.17 




NA 


0.0019 


0.0042 


0.0055 


0.0014 


0.0036 


0.0040 


0.0103 


0.0088 


0.0064 


0.0075 


0.0009 


0.0059 


0.0255 


0.0180 



5.4. Experiment Results 

After the above-mentioned preliminaries, the performance of DLSE and TRR were assessed with 
real measurements. Note that, to avoid the complicated programming with the Zigbee protocol, the 
concentration measurements were transmitted to the workstation and the distributed implementation of 
DLSE was realized off-line. However, the online distributed implementation of DLSE is applicable 
after a further study on hardware design and embedded programming, which was not focused in this 
paper. Four scenarios were created by changing the flow rate of the bubbler from 400 seem to 
1000 seem in increments of 200 seem, while keeping the temperature of the thermostatic bath, the 
wind speed at 60 °C, 70 cm/s, respectively. The concentration threshold is set as 10 ppm to eliminate 
the measurements with low SNRs. Based on the measurements collected in each of the groups, one 
hundred of tests with different randomly selected initial points were conducted using TRR and DLSE. 
The boxplot of the localization errors in these groups were shown in Figure 13. 

Figure 13. Boxplot of the localization errors in the four groups of tests. In both 
sub-figures, the maximum whisker lengths of the boxplots are set as 1. (a) The localization 
errors of TRR; (b) The localization errors of DLSE. 
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As shown in Figure 13a, although the minimum localization errors obtained using TRR were 
comparatively small, their overall distribution was highly fragmented and not symmetrical with respect 
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to the median. Thus, the good performance of TRR is not guaranteed if the initial point is not properly 
chosen. In Figure 13b, DLSE succeeded in all of the tests with considerably high accuracy. The 
localization errors obtained based on the same group of measurements were generally symmetrically 
distributed with respect to the associated medians. In addition, higher flow rate of the bubbler, which 
means larger release rate of the chemical source and more participating scattered sensor nodes, results 
in smaller mean of localization errors. 

6. Conclusions 

In this paper, a distributed chemical source localization method is proposed for a WSN. Aiming to 
minimize the summands in the objective function of LSE-based CSL using WSNs, i.e., the sum of 
squared measurement errors, individual local estimations of the source location are conducted on the 
participating scattered sensor nodes. This scheme avoids gathering the raw measurements and enables 
the realization of a distributed estimation method. The local estimates are convexly combined into the 
global estimate, which is used as the prior information for the local estimation at the next iteration of 
localization. The global estimate includes the information of all the local estimates in different degrees 
and enables collaboration among all the participating scattered sensor nodes. Our method shows a 
global convergence property with fast convergence rate, even when the actual chemical source is far 
away from the deployment area of the WSN. Thus, the initial point of our method can be preset as a 
random point before the process of measurement. 

Our method iteratively incorporates the global information about the source location in a distributed 
manner so as to achieve a global convergence property. When the distribution models of radioactive 
substance and acoustic energy are utilized, our method can also be adapted to locate the radioactive 
and acoustic sources. Future works may cover deriving a theoretical probability of being the real 
chemical source for the local estimates. When such a probability is used as the weight of the associated 
local estimate, our method could evolve into a distributed particle filter. 
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